Learning to Compress Ergodic Sources
Authors
Abstract
We present an adaptive coding technique which is shown to achieve optimal coding in the limit as the size of the text grows, while the data structures associated with the code grow only linearly with the text. The approach relies on Huffman codes which are generated relative to the context in which a particular character occurs. The Huffman codes themselves are inferred from the data that has already been seen. A key part of the paper involves showing that the loss per character incurred by the learning process tends to zero as the size of the text tends to infinity. This involves an analysis in an on-line learning framework bounding the cumulative loss, where loss is defined to be the excess code length. By using the Bayes prediction distribution and code, the expected loss per character converges to zero at the best possible rate of O(log n/n). By allowing the length of contexts to grow in response to commonly occurring subsequences, the coding is efficient precisely where it needs to be, hence achieving a high compression rate at a relatively low overhead in terms of data structure storage. *Joint affiliation with Department of Mathematics, London School of Economics, and Department of Computer Science, Royal Holloway, University of London. Proceedings of the 1996 Data Compression Conference (DCC), © 1996 IEEE.
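The abstract's core idea, building a Huffman code per context from counts of previously seen characters, can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the fixed context order and the escape mechanism (emitting a raw 8-bit literal the first time a character appears in a context) are assumptions added for the sketch.

```python
import heapq
from collections import Counter, defaultdict

def huffman_code(freqs):
    """Build a Huffman code (symbol -> bitstring) from symbol counts."""
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    # Heap entries carry a unique tiebreaker so dicts are never compared.
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def adaptive_encode(text, order=1):
    """Encode each character with a Huffman code built only from the
    symbols previously seen in the same length-`order` context."""
    counts = defaultdict(Counter)
    out = []
    for i, ch in enumerate(text):
        ctx = text[max(0, i - order):i]
        seen = counts[ctx]
        if ch in seen:
            out.append(huffman_code(seen)[ch])
        else:
            out.append(format(ord(ch), "08b"))  # escape: raw 8-bit literal
        seen[ch] += 1
    return "".join(out)
```

On a highly predictable source such as a repeated "ab" pattern, each context quickly determines its successor, so almost every character costs a single bit after the short warm-up, which is the behaviour the O(log n/n) excess-loss bound quantifies.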
Similar references
Nonparametric Estimation and On-Line Prediction for General Stationary Ergodic Sources
We propose a learning algorithm for nonparametric estimation and on-line prediction for general stationary ergodic sources. The idea is to prepare many histograms and estimate the probability distribution of the bins in each histogram. We do not know a priori which histogram expresses the true distribution: if the histogram is too sharp, the estimation captures the noise too much (overestimatio...
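The multi-histogram idea above can be illustrated with a small sketch: several histogram resolutions are maintained in parallel and combined by a Bayesian mixture, weighting each histogram by the sequential likelihood it has assigned so far. The particular resolutions and the Laplace smoothing are assumptions made for the sketch, not details from the cited paper.

```python
import math

class HistogramMixture:
    """On-line density estimate over [0, 1): several histogram
    resolutions combined by Bayesian mixture weighting, with each
    histogram's weight proportional to its sequential likelihood."""

    def __init__(self, resolutions=(2, 4, 8, 16, 32)):
        self.res = resolutions
        self.counts = {k: [0] * k for k in resolutions}
        self.n = 0
        self.logw = {k: 0.0 for k in resolutions}  # cumulative log-likelihood

    def _density(self, k, x):
        # Laplace-smoothed predictive density of the k-bin histogram at x.
        b = min(int(x * k), k - 1)
        return (self.counts[k][b] + 1) / (self.n + k) * k

    def predict(self, x):
        """Mixture predictive density at x, before x is observed."""
        num = den = 0.0
        for k in self.res:
            w = math.exp(self.logw[k])
            num += w * self._density(k, x)
            den += w
        return num / den

    def update(self, x):
        for k in self.res:
            self.logw[k] += math.log(self._density(k, x))
            b = min(int(x * k), k - 1)
            self.counts[k][b] += 1
        self.n += 1
```

Because the mixture weight of a histogram tracks how well it has predicted the data so far, a too-coarse or too-sharp histogram is automatically down-weighted, which is exactly the model-selection problem the abstract describes.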
The ergodic decomposition of asymptotically mean stationary random sources
It is demonstrated how to represent asymptotically mean stationary (AMS) random sources with values in standard spaces as mixtures of ergodic AMS sources. This is an extension of the well-known decomposition of stationary sources which has facilitated the generalization of prominent source coding theorems to arbitrary, not necessarily ergodic, stationary sources. Asymptotic mean stationarity gener...
Individual ergodic theorem for intuitionistic fuzzy observables using intuitionistic fuzzy state
The classical ergodic theory has been built on σ-algebras. Later the Individual ergodic theorem was studied on more general structures like MV-algebras and quantum structures. The aim of this paper is to formulate the Individual ergodic theorem for intuitionistic fuzzy observables using m-almost everywhere convergence, where m...
Efficient Lossless Compression of Trees and Graphs
In this paper, we study the problem of compressing a data structure (e.g. tree, undirected and directed graphs) in an efficient way while keeping a similar structure in the compressed form. To date, there has been no proven optimal algorithm for this problem. We use the idea of building the LZW tree in LZW compression to compress a binary tree generated by a stationary ergodic source in an optimal ma...
Comparing the effect of warm moist compress and Calendula ointment on the severity of phlebitis caused by 50% dextrose infusion: A clinical trial
Abstract: Background: One of the important hypertonic solutions is 50% dextrose. Phlebitis is the most common complication of this solution, the management of which is quite necessary. Regarding this, the present study aimed to compare the effect of warm moist compress and Calendula ointment on the severity of phlebitis caused by 50% dextrose infusion. Methods: This clinical trial was conducted on...